ABSTRACT

Quantitative use of satellite-derived maps of monthly rainfall requires some measure
of the accuracy of the satellite estimates. The rainfall estimate for a given map
grid box is subject to both remote-sensing error and, in the case of low-orbiting
satellites, sampling error due to the limited number of observations of the grid box
provided by the satellite. A simple model of rain behavior predicts that rms random
error in grid-box averages should depend in a simple way on the local average rain
rate, and the predicted behavior has been seen in simulations using surface rain-gauge
and radar data. This relationship was examined using satellite SSM/I data obtained
over the western equatorial Pacific during TOGA COARE. RMS error inferred directly
from SSM/I rainfall estimates was found to be larger than predicted from surface data,
and to depend less on local rain rate than was predicted. Preliminary examination of
TRMM microwave estimates shows better agreement with surface data. A simple method
of estimating rms error in satellite rainfall estimates is suggested, based on
quantities that can be directly computed from the satellite data.

1. Introduction

Satellite data are now regularly used to produce gridded maps of rainfall
averaged over time intervals ranging from hours to many months. It has not been easy, 
however, to provide accompanying quantitative estimates of the accuracies of the grid- 
point averages. This is in part because remote-sensing techniques do not yet provide 
sufficient information to allow unambiguous conversion of measurements into rain-rate 
values for the observed area, and the distribution of errors introduced in the conversion 
depends on the observed situation in ways that are not always known. The problem is 
exacerbated by the highly intermittent character of rain, which makes averages of rain 
data noisy and comparison of remote-sensing results with measurements made on the 
ground difficult. 

The Tropical Rainfall Measuring Mission (TRMM) satellite was launched in 1997. 
Descriptions of TRMM are given by Simpson et al. (1988, 1996) and Kummerow et 
al. (1998). One of the primary goals of the mission is to provide rain data sufficiently 
accurate that TRMM satellite products can serve as a kind of transfer standard to 
calibrate rain estimates from other satellite systems and thereby improve the overall 
accuracy of global rain maps. To help reach this goal, the satellite carries several 
instruments on board including a precipitation radar and a passive microwave sensor, 
the latter having higher resolution than most satellite-borne microwave instruments. 

An important component of the effort towards reaching this goal is developing 
quantitative estimates of the accuracy of the gridded products of TRMM. A number 
of different approaches to this are being tried, including development of models for 
the error intrinsic to the remote sensing methods themselves; comparison of satellite 
products to ground-based measurements from rain-gauge arrays, radar sites, and aircraft 
measurements during field campaigns; and comparison with other satellite observations. 

Although much can be learned about sources of error in the TRMM rain 
estimates from examining individual overlapping coincident snapshots of rain events 
taken by various TRMM instruments and by other satellites and ground-based
observation systems, much can also be learned from comparisons among averages of 
satellite data and ground-based data. As long as the averages of the satellite estimates 
and ground-based or other-satellite estimates are taken from time intervals and spatial 
locations which are believed to have similar statistics, such averages allow enormously 
more data to be used in the comparisons than can be assembled from coincident 
observations. Comparisons of averages of data reveal biases in rain estimates. Such 
biases may be small compared to discrepancies found in point-by-point comparisons 
of coincident observations, yet knowledge of these biases is important when TRMM 
data are used as a transfer standard, and especially so when the data are used for 
climatological studies. 

One of the commonest methods of comparing satellite estimates of rainfall to 
ground-based observations and to other satellite estimates is to test the agreement 
of averages over a spatial domain, such as a grid box on a map, averaged over a 
sufficiently long time period that the averages are stable enough for the comparison 
to be informative. Even if the remote-sensing techniques are perfectly accurate, such 
averages will contain sampling error because the systems are not measuring rainfall 
everywhere in the area at every moment. Rain gauges, for example, measure more or 
less continuously in time but cover very little of the area, whereas radar views irregularly
shaped volumes of the atmosphere at frequent but non-continuous intervals of time,
and satellite observations are still more widely spaced in time. While averages from two
different systems may disagree because of inherent errors in the measurement methods, 
they will almost certainly disagree because they contain different sampling errors. 

Mathematically, comparison of two grid-box averages can be formulated as follows.
Suppose a system X (TRMM, perhaps) gives an estimate $R_X$ for the average rainfall
$R$ in a grid box over some time interval of the order of a month or so, and that system
Y (another satellite, perhaps, or a ground-based system) gives an estimate $R_Y$ for the
same area and time. Each system makes an (unknown) error

$$\epsilon_a = R_a - R; \quad a = X, Y. \qquad (1.1)$$

A portion of the error $\epsilon_a$ is due to possible algorithmic and instrumental errors in
estimating rain rate when it is observed, or to differences in the mean rain rate observed
by the two systems due to spatial or temporal inhomogeneity in the rain statistics,
such as spatial variation in the mean rain rate, or a diurnal cycle. The rest is due to
inadequate sampling. To compare the estimates of the two systems to see if one is
biased relative to the other, one examines whether the difference $R_X - R_Y$ is bigger
than can be explained by chance. A straightforward approach is typically to estimate
the mean squared difference

$$\begin{aligned}
\sigma^2 &= \langle (R_X - R_Y)^2 \rangle \\
&= \langle (\epsilon_X - \epsilon_Y)^2 \rangle \\
&= \sigma_X^2 + \sigma_Y^2 - 2\langle \epsilon_X \epsilon_Y \rangle, \qquad (1.2)
\end{aligned}$$

where $\sigma_a^2 = \langle \epsilon_a^2 \rangle$, and to estimate the bias as $R_X - R_Y \pm 2\sigma$. (The limits $\pm 2\sigma$ would be
appropriate if the difference $R_X - R_Y$ is normally distributed and 95% confidence limits
are wanted. If $|R_X - R_Y| > 2\sigma$, one is fairly sure that there is a nonzero bias present.
See, for example, Taylor (1997) for a discussion of such approaches.)
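
As a numerical illustration of the test based on Eq. (1.2), the following minimal Python sketch computes the rms difference from assumed error levels and applies the two-sigma criterion. The function names and the input values are hypothetical and chosen only for illustration; the criterion presumes approximately normal differences, as noted above.

```python
import math

def sigma_of_difference(sigma_x, sigma_y, cov_xy=0.0):
    """Rms difference from Eq. (1.2):
    sigma^2 = sigma_X^2 + sigma_Y^2 - 2 <eps_X eps_Y>."""
    variance = sigma_x**2 + sigma_y**2 - 2.0 * cov_xy
    if variance < 0.0:
        raise ValueError("covariance is inconsistent with the error variances")
    return math.sqrt(variance)

def bias_is_significant(r_x, r_y, sigma):
    """True when |R_X - R_Y| exceeds the 2-sigma (roughly 95%) limit."""
    return abs(r_x - r_y) > 2.0 * sigma

# Hypothetical grid-box values (arbitrary but consistent rain-rate units).
sigma = sigma_of_difference(sigma_x=3.0, sigma_y=4.0)  # covariance neglected
print(sigma, bias_is_significant(r_x=20.0, r_y=8.0, sigma=sigma))
```

Setting `cov_xy` to zero corresponds to neglecting the covariance term, which gives a conservative (larger) value of sigma whenever the true covariance is positive.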

The use of $\sigma$ in this way to make quantitative inferences about the bias, however,
assumes that the differences $R_X - R_Y$ are normally distributed,
which cannot be taken for granted. Even when the assumption of normality is not
completely justified, though, the above approach to inferring a bias is likely to be a good
approximation to the correct one. A more satisfactory approach to this problem might
be to collect enough data from the two systems so that the statistical distribution of the
difference $R_X - R_Y$ itself could be established. Having obtained it, one could empirically
determine confidence limits for the bias $\langle R_X - R_Y \rangle$, where the angular brackets indicate
an average over an ensemble of datasets similar in nature to what has been collected.
An approach like this would probably benefit from employing resampling techniques.
See Zwiers (1990) and Wilks (1997), however, for important caveats concerning these
methods.

The more conventional approach based on (1.2) has the advantage that it is 
easily automated and easy to apply to disparate regions of the world, time periods, and 
rainfall-estimation methods. Published results of studies done by various groups are 
often already cast in this format so that comparisons can be quickly made. It should 
also be noted that the error estimates $\sigma$ as defined above are exactly what are needed
if a satellite map of rainfall is to be compared with climate models, whose output is
generally in the form of grid-box averages. If the satellite map of $R$ is assumed to be
accurate to $\pm 2\sigma_X$ for some grid box, and the climate model forecast to be accurate
to $\pm 2\sigma_Y$ (due to predictability limits), the satellite map value and the model forecast
should agree to within $\pm 2\sigma$ computed from Eq. (1.2).

The purpose of this paper is to explore methods of estimating $\sigma_X$ for weekly to
monthly averages of rain estimates obtained from microwave instruments on low
earth-orbiting satellites, including those on TRMM and the Special Sensor Microwave/Imagers
(SSM/I) on Defense-Meteorological-Satellite-Program (DMSP) satellites. Because
raindrops interact strongly with microwave radiation, such instruments are believed
to provide some of the best estimates of rain rates observed from satellite platforms.
Sampling error contributes substantially to $\sigma_X$ for these satellites because they observe
any spot on the earth only a few times per day at best. It was originally argued (e.g.,
Wilheit 1988, Bell et al. 1990) that “retrieval errors,” the errors made in estimating
actual rain rates from microwave measurements, might contribute relatively little to $\sigma_X$
because the large number of fields of view (FOVs) averaged over in forming a monthly
average would tend to produce relatively small net average retrieval error, even if
individual random retrieval errors were large. Results presented in this paper appear
not to support this.

The error $\sigma_X$ is likely to depend on many aspects of rain in a given region,
such as the amount and types of rainfall, the average synoptic conditions, the season, 
sea-surface temperatures, availability of moisture, levels of aerosol contaminants, etc., 
as well as on the sampling and observational characteristics of the satellite and its 
instruments. In a previous paper, Bell and Kundu (1996), hereafter abbreviated as 
BK96, derived a simple formula expressing the sampling error as a function of the 
mean rain rate and an “effective” number of samples. A more general argument for 
the same formula was subsequently developed by Bell and Kundu (2000), hereafter 
abbreviated BK00. They tested this formula using the sampling characteristics of the 
TRMM satellite and the statistical properties of a number of datasets from ground- 
based rain-gauge and radar measurements. In this paper, we continue the investigation
by comparing the formula’s predictions with the behavior of rms error in monthly averages
obtained from a satellite-derived dataset.

The dataset studied here contains retrieved rain rates over the western tropical 
Pacific during the Tropical Ocean Global Atmosphere/Coupled Ocean Atmosphere 
Research Experiment (TOGA COARE), from November 1992 to February 1993.
The rain rates are derived from SSM/I data taken from two DMSP satellites that were
orbiting at the time, the F10 and F11. The algorithm used in the retrievals is similar
to but not so highly developed as the one presently being used for TRMM. Details will 
be given later. It is found that a fairly simple parameterization of the random error 
in monthly averages over 2.5° x 2.5° grid boxes seems to describe the data well, but 
that the dependence on the mean rain rate in the grid box is different from what was 
predicted by the model and observed using ground-based data as summarized in BK00, 
and the error magnitudes are much higher. 


The source of this difference appears to be the very different responses of the 
satellite microwave instruments and algorithm to the presence of stratiform rain when 
compared with the ground-based measurements. This explanation will be discussed in 
a separate paper. Such a rain-type-dependent response has important implications for 
using one satellite estimate to calibrate another, as is sometimes done in combining 
datasets to produce global maps of rainfall, or in comparing satellite estimates to 
ground-validation datasets.

Despite the differences observed here in the random error of satellite averages 
compared with that of ground-based averages, the approach can still be used to 
obtain parameterized estimates of $\sigma_X$ as a function of the average rain rate in a
grid box, and thus can be used to supply fairly simple descriptions of the confidence 
levels to be applied to each grid-box value of rain rate generated from the satellite 
data. Comparisons of satellite estimates against values obtained from ground-based 
instruments can therefore be carried out using Eq. (1.2), provided the sampling error 
$\sigma_Y$ in the ground-based estimates can be obtained and the covariance term $\langle \epsilon_X \epsilon_Y \rangle$
estimated. In many instances the covariance term can probably be neglected, either
because it is actually small or because it will tend to decrease $\sigma$, so that ignoring it will
mean that $\sigma$ is at worst overestimated and the error bars will therefore be conservatively
estimated.

In the following section we briefly review a model for how sampling error should 
depend on rain rate and other factors and how sampling error estimates obtained with 
ground-based data compare with the simple model. We describe the SSM/I-derived 
dataset from which estimates of the random error in SSM/I monthly averages are 
obtained in section 3, and in section 4 compare the estimates with estimates made from 
surface radar taken in tropical oceanic environments. The SSM/I statistics display a 
simple power-law dependence on local rain rate, and these power laws are described 
in section 5. In section 6 we report some preliminary results on rainfall statistics 
observed by TRMM and compare and contrast them with the results from the SSM/I
observations. Section 7 summarizes our results and gives some concluding remarks.
Some statistical and computational details are provided in an appendix.

2. Review of a simple model for sampling error 

A simple theoretical model presented in BK00 suggests how sampling error might
depend on the rainfall climatology and satellite sampling characteristics for a given
grid box. For the reader’s convenience and to establish notation we briefly review the
formula and the underlying concepts and definitions. For the detailed derivations see
BK00.

a. Definitions 

We are interested in an estimate of the space-time-averaged rain rate

$$R = \frac{1}{T} \int_0^T dt\, R_A(t), \qquad (2.1)$$

where

$$R_A(t) = \frac{1}{A} \int_A d^2x\, R(\mathbf{x}, t) \qquad (2.2)$$

is the area-averaged instantaneous rain rate, $R(\mathbf{x}, t)$ is the local rain rate at the point
$\mathbf{x}$ at time $t$, $T$ is the averaging period, taken here to be one month, and $A$ is the area
of the grid box. We assume A to be large enough so that the rain rates in neighboring 
boxes can be assumed to be statistically uncorrelated to a good approximation. 

The satellite in general views a grid box intermittently and even then sometimes
only partially. Thus the instrument provides an estimate $\hat{R}_i$ of the rain rate at times
$\{t_i, i = 1, \ldots, n\}$ averaged over an area $A_i \le A$ corresponding to the region of overlap
between the grid box and the instrument swath during the overpass at time $t_i$. The
satellite estimate $\hat{R}$ of the true monthly average $R$ is obtained as a weighted average of
the individual estimates $\hat{R}_i$:

$$\hat{R} = \frac{1}{n} \sum_{i=1}^{n} w_i \hat{R}_i, \qquad (2.3)$$

with suitably chosen weights $w_i$ normalized to

$$\frac{1}{n} \sum_{i=1}^{n} w_i = 1. \qquad (2.4)$$

(The estimate $\hat{R}$ would be an example of $R_X$ in Eq. 1.1.) A convenient way to
obtain $\hat{R}$ directly from the data is to average the rain-rate estimates from all the
instrument footprints that fall within the area $A$ over the period $T$. (If the footprints
are distributed relatively uniformly over the areas $A_i$ then such an average is equivalent
to setting $w_i \propto A_i/A$. If the footprints are nonuniformly distributed but the area
average $\hat{R}_i$ has been corrected for this, the same choice for $w_i$ is appropriate. It is shown
in BK96 that this choice of weights provides a near-optimal estimate of $R$ for most grid
boxes seen by TRMM except those at the highest latitudes.)
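
The weighting just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it takes Eq. (2.3) to be the mean of the weighted overpass estimates, with weights proportional to the covered fraction of the box and scaled to satisfy the normalization (2.4). The function names and the sample numbers are hypothetical.

```python
def coverage_weights(area_fractions):
    """Weights proportional to A_i/A, scaled so that (1/n) * sum(w_i) = 1."""
    n = len(area_fractions)
    total = sum(area_fractions)
    if n == 0 or total <= 0.0:
        raise ValueError("need at least one overpass with nonzero coverage")
    return [n * fraction / total for fraction in area_fractions]

def monthly_estimate(overpass_rates, area_fractions):
    """Weighted average of the overpass estimates: (1/n) * sum(w_i * R_i)."""
    if len(overpass_rates) != len(area_fractions):
        raise ValueError("one coverage fraction is required per overpass")
    weights = coverage_weights(area_fractions)
    n = len(overpass_rates)
    return sum(w * r for w, r in zip(weights, overpass_rates)) / n

# Hypothetical month with three overpasses: one full view and two half views.
rates = [2.0, 4.0, 0.0]        # area-averaged rain rate seen on each overpass
fractions = [1.0, 0.5, 0.5]    # A_i / A for each overpass
print(coverage_weights(fractions), monthly_estimate(rates, fractions))
```

With this choice the estimate reduces to the coverage-weighted mean, sum(A_i R_i) / sum(A_i), which is what averaging all footprints in the box gives when footprints are spread uniformly over the viewed areas.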

The uncertainty in the estimate $\hat{R}$ is measured by the mean squared error

$$\sigma_E^2 = \langle (\hat{R} - R)^2 \rangle, \qquad (2.5)$$

where the angular brackets denote an average over an ensemble of rain scenarios
consistent with the local rainfall climatology. In general, as discussed in BK00, $\sigma_E^2$
contains contributions from both the sampling error arising from intermittent satellite
coverage and the retrieval error arising from the errors in converting the results of
measurements into actual rain rates. If we can assume that the retrieval errors are
uncorrelated from footprint to footprint, the contribution of these errors to $\hat{R}$ tends
to be small (Wilheit 1988; Bell et al. 1990), and the total error $\sigma_E$ is dominated by
the sampling error component. If the contribution to $\sigma_E^2$ from retrieval errors can be
neglected, the satellite estimates $\hat{R}_i$ for each overpass can be treated statistically as if
they were exact. In Eq. (2.3) we can then set $\hat{R}_i = R_i \equiv R_{A_i}(t_i)$ and compute the
sampling error component using (2.5). We will return to these assumptions later.

As we have already mentioned, the sampling error can depend on the local 
rainfall statistics as well as sampling characteristics of the satellite. A simple model 
for this dependence is based on the straightforward assumption that variations in 
the total rainfall amount in an area are primarily due to variations in the number of
independently evolving precipitating systems present within it rather than variations 
in the intensity of the individual systems. Such an assumption is present in almost 
all statistical treatments of rainfall, and some such assumption can be used to justify 
rain algorithms that estimate areal rainfall from areal coverage. The assumption is 
dynamically plausible because the convective cores of storms are quickly evolving small- 
scale phenomena, limited in their development by local lapse rates and the availability of 
moisture. Synoptic-scale lower-level convergence may affect the probability of convective 
plumes forming, but once started, they are self-limiting. 

Starting from this simple assumption, BK00 obtained the formula

$$\frac{\sigma_E}{R} = C\,(R A S)^{-1/2}, \qquad (2.6)$$

where

$$S = \sum_{i=1}^{n} A_i/A \qquad (2.7)$$

is the “effective” number of full area sweeps of the grid box $A$ by the satellite
instrument swaths, and the prefactor $C$ depends only weakly on a variety of rainfall
characteristics consistent with a given value of the mean rain rate $R$, as described below.
Arguments for a $1/\sqrt{R}$ dependence of relative sampling error on rain rate like that in Eq.
(2.6) were given in BK96, who noted some evidence for it when estimates from
simulations with radar data over southern coastal Japan (Oki and Sumi 1994) were plotted
versus $R$. An extensive discussion of the dependence of sampling error on rain rate $R$ is
given by Huffman (1997). Quartly et al. (1999) provide a clear review of arguments for
(2.6) and an example of an interesting application of these ideas to a rain climatology
developed with data from the TOPEX/POSEIDON satellite dual-frequency altimeter.
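
The scaling in Eq. (2.6) is easy to explore numerically. The sketch below is illustrative only: the function names, the value of the prefactor C, and the sample coverage are hypothetical, and C is assumed to carry whatever units make the ratio dimensionless.

```python
import math

def effective_sweeps(area_fractions):
    """Effective number of full-area sweeps, S = sum_i A_i/A (Eq. 2.7)."""
    return sum(area_fractions)

def relative_sampling_error(c, mean_rain, box_area, sweeps):
    """Relative rms sampling error sigma_E/R = C (R A S)^(-1/2) (Eq. 2.6)."""
    if mean_rain <= 0.0 or box_area <= 0.0 or sweeps <= 0.0:
        raise ValueError("R, A, and S must all be positive")
    return c / math.sqrt(mean_rain * box_area * sweeps)

# Hypothetical month: 40 overpasses, each covering three quarters of the box.
s = effective_sweeps([0.75] * 40)
low = relative_sampling_error(c=1.0, mean_rain=0.1, box_area=1.0, sweeps=s)
high = relative_sampling_error(c=1.0, mean_rain=0.4, box_area=1.0, sweeps=s)
print(s, low / high)
```

For fixed C, A, and S, quadrupling the mean rain rate halves the relative error, which is the inverse-square-root dependence discussed above.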

Numerous estimates of rms sampling error have been made in the literature using 
simulated satellite sampling of data taken by ground-based measurement systems in a 
variety of geographical regions. BK00 examined many of these estimates, and found
that the dependence of $\sigma_E/R$ on $R$ was predicted quite well by (2.6) in those regions
where data were available in sufficient quantities. In particular, as mentioned above,
results of simulations with radar data over southern coastal Japan by Oki and Sumi
(1994) agree quite well (BK96) with (2.6); and Steiner (1996) obtained error estimates
using simulations with rain-gauge and radar data from Darwin and Melbourne, Florida,
and found that he could fit the dependence of error on $R$ with an expression quite close
to (2.6).

b. Model explanation 

A simple model that gives the relationship (2.6) can now be described. A more
thorough discussion is given by BK00. The model assumes that rainfall consists of
individual uncorrelated rain events having, on average, area $a$ and duration $2\tau_a$. From
these assumptions BK00 derived the expression

$$C = (a r_c)^{1/2} \left[1 - 2\tau_a/(T/S)\right]^{1/2}, \qquad (2.8)$$

where $r_c = R/p$ is the mean nonzero rain rate in a satellite footprint (subscript c for
“conditional”), $p$ being the probability that a footprint contains nonzero rain. The ratio
$T/S$ can be thought of as the average time interval between two consecutive full area
observations by the satellite. When the sampling is sparse, one has $T/S \gg 2\tau_a$, and
in this limit $C \approx \sqrt{a r_c}$. When the effective sampling interval is comparable to $\tau_a$,
this simple cell model is no longer applicable, and one must employ a more accurate
representation of the statistical properties of the local rain field, an example of which
is described next.

A somewhat different explicit form of the constant $C$ was derived by Bell et al.
(1990) using an approach originally due to Laughlin (1981). Assuming that the entire
area $A$ is sampled at regular intervals $\Delta t = T/S$, they obtained the formula

$$\sigma_E^2 \approx (\sigma_A^2/S)\, f(\Delta t/2\tau_A), \qquad (2.9)$$

where $\sigma_A^2$ is the variance of the instantaneous area-averaged rain rate,

$$\sigma_A^2 = \mathrm{var}[R_A(t)], \qquad (2.10)$$

$\tau_A$ is the corresponding correlation time [$(1/e)$-folding time of the autocorrelation of
$R_A(t)$, assumed to be pure exponential], and

$$f(y) = \coth y - 1/y. \qquad (2.11)$$

The approximation (2.9) assumes $T \gg \tau_A$, which is certainly valid when $T$ is of the
order of 1 month, since $\tau_A$ is typically 4-10 h. The variance of the box-averaged rain
rate $\sigma_A^2$ can, in turn, be expressed in the form

$$\sigma_A^2 = s^2 \lambda^2 / A, \qquad (2.12)$$

where $s^2$ is the variance of the instantaneous rain rate averaged over a satellite
footprint. The quantity $\lambda^2$ is the effective area of a rain fluctuation that can be
considered as statistically independent of other such fluctuations within the grid box
$A$, in analogy with the definition of an “effectively independent sample size” by Leith
(1973). It is given by

$$\lambda^2 = \frac{A}{N_0^2} \sum_{i=1}^{N_0} \sum_{j=1}^{N_0} \rho(|\mathbf{x}_i - \mathbf{x}_j|), \qquad (2.13)$$

where $\rho(z)$ denotes the spatial correlation between rain in two footprints separated by
a distance $z$, $N_0$ is the total number of footprints in $A$, and the average is performed
over all pairs of footprints. The length $\lambda$ can be thought of as the distance over which
footprint-averaged rain rate is correlated, or as the typical size of a coherent rain event.
Note that the value of $\lambda$ may in principle vary with both FOV size [which affects $\rho(z)$]
and the area $A$, which affects the range of separations $|\mathbf{x}_i - \mathbf{x}_j|$ encountered in the double
sum.
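
The double sum defining the effective area of a rain fluctuation can be evaluated directly once a correlation function is chosen. The following Python sketch is illustrative only: it assumes the effective area equals the box area divided by the square of the footprint count, times the sum of the correlation over all ordered footprint pairs (self-pairs included). The exponential correlation, the footprint layout, and all names are hypothetical.

```python
import math

def effective_area(positions, box_area, correlation):
    """Effective independent area: (A / N0^2) * sum_i sum_j rho(|x_i - x_j|)."""
    n0 = len(positions)
    if n0 == 0:
        raise ValueError("at least one footprint is required")
    total = 0.0
    for xi, yi in positions:
        for xj, yj in positions:
            total += correlation(math.hypot(xi - xj, yi - yj))
    return box_area * total / n0**2

def exponential_correlation(z, scale_km=50.0):
    """Hypothetical spatial correlation with an assumed 50 km e-folding scale."""
    return math.exp(-z / scale_km)

# Hypothetical 10 x 10 array of footprint centers, 28 km apart, in a 280 km box.
centers = [(28.0 * i, 28.0 * j) for i in range(10) for j in range(10)]
area = 280.0 * 280.0
print(effective_area(centers, area, exponential_correlation))
```

Two limits are useful checks: perfectly correlated footprints give an effective area equal to the box area, and mutually uncorrelated footprints give the box area divided by the number of footprints.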

Combining equations (2.9) and (2.12), using the relations $R = p r_c$, and

$$\begin{aligned}
s^2 &= p s_c^2 + p(1 - p) r_c^2 \qquad (2.14) \\
&\approx p(s_c^2 + r_c^2) \qquad (2.15)
\end{aligned}$$

(the latter valid when $p$ is small), one again obtains formula (2.6) for the sampling error,
with the identification

$$C = \lambda \{r_c (1 + \mu_c^2)\}^{1/2} f^{1/2}(\Delta t/2\tau_A), \qquad (2.16)$$

where $r_c$ and $s_c^2$ are the mean and variance of nonzero rain rate (i.e., conditional on
$R_{\rm FOV} > 0$), and $\mu_c = s_c/r_c$. It should be pointed out that although the quantities
$p$, $r_c$, $s_c$, and $\lambda$ may each depend strongly on the footprint size, our simple theory leads
to the expectation that expressions (2.8) or (2.16) determining the constant $C$ are
insensitive to it. Short et al. (1993) have suggested that the ratio $\mu_c = s_c/r_c$ is relatively
constant over a range of footprint sizes, averaging times, types of data (rain-gauge or
radar) and climates. In the limit of sparse sampling this would imply

$$C \approx \mathrm{const} \times r_c^{1/2} \lambda, \qquad (2.17)$$

which should be compared to (2.8). Note that unless $A$ is much larger than a typical
rain event, $\lambda^2$ in (2.13) will depend non-trivially on $A$, and thereby change the $A$
dependence of $\sigma_E$ in (2.6). In fact, when $A$ approaches the size of a single footprint, it
is easy to see from (2.13) that $\lambda^2 \approx A$.
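
A short numerical sketch may help in seeing how the prefactor behaves between dense and sparse sampling. It assumes that combining (2.6), (2.9), (2.12), and (2.15) gives a prefactor equal to the correlation length times the square root of r_c (1 + mu_c^2) f(dt / 2 tau_A), with f(y) = coth y - 1/y as in (2.11). All function names and parameter values are hypothetical.

```python
import math

def laughlin_f(y):
    """f(y) = coth(y) - 1/y from Eq. (2.11), defined here for y > 0."""
    if y <= 0.0:
        raise ValueError("y must be positive")
    return 1.0 / math.tanh(y) - 1.0 / y

def prefactor_c(corr_length, r_c, mu_c, dt, tau_a):
    """Assumed form: C = L * sqrt(r_c * (1 + mu_c^2) * f(dt / (2 * tau_a)))."""
    f_value = laughlin_f(dt / (2.0 * tau_a))
    return corr_length * math.sqrt(r_c * (1.0 + mu_c**2) * f_value)

# Hypothetical values: 24 h between full views, 6 h correlation time.
print(laughlin_f(24.0 / (2.0 * 6.0)))
print(prefactor_c(corr_length=50.0, r_c=2.0, mu_c=1.5, dt=24.0, tau_a=6.0))
```

For large y the function f tends to 1, so the prefactor approaches its sparse-sampling value; for small y it behaves like y/3, so the prefactor shrinks as the sampling interval becomes short compared with the correlation time.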

3. Random error of monthly SSM/I rain rates 

Rain estimates made from SSM/I observations provide a way of testing directly 
the validity of the proposed simple theory of sampling error. Coverage by the SSM/I 
as measured by S in (2.7) is quite close to that of TRMM’s passive microwave sensor 
(TMI) for grid boxes at low latitudes, and so sampling errors should be similar in size, 
though their respective retrieval errors may differ. In this section we shall investigate 
the statistical behavior of the retrieved rain rates and the inferred statistics of random 
errors in gridded monthly averages of retrievals. 

a. The SSM/I dataset

The dataset we used consists of rain data from two satellites, the F10 and F11, in
nearly sun-synchronous polar orbits around the earth. The data were taken during the
four-month Special Observing Period (SOP) of TOGA COARE from November 1992 to
February 1993. Local visit times of the F10 and F11 during the SOP were roughly 9:30
am/pm and 5:30 am/pm respectively. The SSM/I on each satellite views a given spot
on the earth an average of about 30 times per month, so that $S \approx 30$ in Eq. (2.6). (For
the TRMM microwave instrument, $S \approx 30$ as well, but local visit times shift over the
course of a month.)

Rain rates were derived using the Goddard Profiling Algorithm, which is based 
on the method described by Kummerow and Giglio (1994a,b), modified following the
description given by Kummerow et al. (1996). The dataset was generated as part of the 
3rd Algorithm Intercomparison Project (AIP-3), as described by Ebert et al. (1996), 
and in more detail by Ebert and Manton (1998). Rain rates are estimated for footprints 
which may be thought of as circles approximately 28 km in diameter, even though in 
reality they are elliptical in shape, the response of the microwave antenna is nonuniform 
over the FOV, and there is blurring due to the finite integration time of the SSM/I 
instruments. Kummerow and Giglio (1994b) provide a more detailed discussion of 
this topic. The retrieved rain rates are provided as successive arcs each containing 64 
partially overlapping footprints and covering altogether a swath about 1400 km wide. 

We study the statistics of rain in the region extending from 10° S to 10° N and 
from 135° E to 175° E in the tropical western Pacific. This region includes the area 
where the TOGA COARE Intensive Flux Array (IFA) was located. For an optimal 
choice of the grid-box size for our statistical analysis one needs to strike a compromise 
among several competing factors. The box needs to be large enough so that rain rates 
in neighboring boxes can be assumed to be statistically uncorrelated. This is essential 
for treating collections of grid-box averages as sets of statistically independent samples, 
so that standard statistical methods of estimating confidence intervals for the averages
can be used. On the other hand one would like the boxes to be small so that there 
are as many boxes as possible, thereby giving us a more detailed, smoother picture of 
the dependence of the retrieval statistics on local rain rate, as will be clear in the next 
section. A small box size also increases the likelihood that rain statistics within the 
box can be regarded as approximately homogeneous. With these factors in mind we have
chosen a grid-box size A = 2.5° x 2.5°. 

b. Estimate of the random error in grid-box averages 

The SSM/I dataset itself does not provide access to the true monthly average
rain rate $R$ appearing in the definition of $\sigma_E$ in Eq. (2.5). To circumvent this difficulty,
we use a procedure suggested by Chang et al. (1993) to estimate the rms random error
$\sigma_E$ for either satellite. Consider the mean squared difference between the F10 and F11
estimates of a grid-box monthly average:

$$\begin{aligned}
\langle (\hat{R}_{10} - \hat{R}_{11})^2 \rangle &= \langle [(\hat{R}_{10} - R) - (\hat{R}_{11} - R)]^2 \rangle \\
&\approx \langle (\hat{R}_{10} - R)^2 \rangle + \langle (\hat{R}_{11} - R)^2 \rangle \\
&\approx 2\sigma_E^2. \qquad (3.1)
\end{aligned}$$

The approximation above is legitimate if the observations by the two
satellites are far enough apart in time to be nearly uncorrelated. Although the
legitimacy of this assumption may be surprising, since the satellites can in principle view the
same scene only 4-5 hours apart, several factors appear to justify the approximation.
Each satellite visits a grid box only once per day on average, and the visits of one
satellite are generally well separated from the other’s. Moreover, some simple calculations
based on Laughlin’s (1981) approach show that, for two satellites with idealized
sampling like that of the F10 and F11, expression (3.1) is quite accurate, even though the
two averages $\hat{R}_{10}$ and $\hat{R}_{11}$ are not in fact statistically independent. It should be noted,
however, that the same calculation indicates that the approximation (3.1) would not be so good
if the satellites were to have closer sampling times or, more surprisingly, respective visit
times nearer to 12 hours apart. Finally, this approximation was corroborated by
performing sampling error calculations using the method developed in BK96 and the exact
sampling patterns of the F10 and F11 satellites, and the approximation (3.1) is borne
out at the level of 5% accuracy.
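
In practice the estimate implied by Eq. (3.1) amounts to halving the mean squared difference of paired grid-box averages and taking the square root. A minimal Python sketch follows; the function name and the sample values are hypothetical, and the calculation presumes that the two satellites' errors are nearly uncorrelated and of equal variance, as discussed above.

```python
import math

def rms_error_from_pairs(r10, r11):
    """Rms random error of one satellite's averages, using
    <(R10 - R11)^2> ~ 2 sigma_E^2 (Eq. 3.1)."""
    if len(r10) != len(r11) or len(r10) == 0:
        raise ValueError("need equal-length, non-empty series of averages")
    mean_sq_diff = sum((a - b) ** 2 for a, b in zip(r10, r11)) / len(r10)
    return math.sqrt(mean_sq_diff / 2.0)

# Hypothetical monthly averages for the same four grid boxes from two satellites.
f10_means = [0.30, 0.42, 0.18, 0.55]
f11_means = [0.26, 0.47, 0.22, 0.49]
print(rms_error_from_pairs(f10_means, f11_means))
```

To examine how the error varies with rain rate, the same calculation could be applied separately to groups of grid boxes binned by their mean rain rate.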

The error variance $\sigma_E^2$ as estimated in (3.1) includes both the sampling error
described in section 2 and also any contributions from randomly varying retrieval errors
in the two satellites’ estimates. As discussed in the previous section, if random retrieval
errors are uncorrelated from footprint to footprint, the contribution to $\sigma_E^2$ from these
errors should be quite small, and the sampling-error component would dominate the
estimate of $\sigma_E^2$ based on (3.1). If, however, retrieval errors are correlated spatially or
from one satellite viewing to the next, $\sigma_E^2$ may contain significant contributions from
retrieval error. For any of the purposes reviewed in the introduction, though, the error
$\sigma_X^2$ introduced there is more properly given by $\sigma_E^2$ rather than the sampling error for
perfectly measured rain rates. The fact that the estimate $\sigma_E^2$ includes retrieval error
as well as sampling error is therefore an advantage rather than a disadvantage to an
approach using (3.1).

Systematic differences in the rain retrievals by the two satellites, if present, could 
also contribute to the estimates of $\sigma_E^2$ made with (3.1). Such differences might be due to
instrumental biases or operational differences between the two systems, or to significant 
diurnal variation in the rain statistics for the grid box. The diurnal variation would, 
however, have to be more complex than a simple first-harmonic sinusoid in order to 
contribute in this way, since each satellite views grid boxes at two times of the day, 
twelve hours apart, on average. Differences in the F10 and F11 averages due to diurnal
effects seem unlikely to be very large for grid boxes over oceanic regions, but could be 
appreciable for boxes containing significant amounts of land. 

c. Statistical analysis of the data

Monthly averages of retrieved rain were obtained for each 2.5° x 2.5° grid box 
in the TOGA COARE SOP dataset described above, yielding a total of 512 samples 
(128 grid boxes, 4 months of data). Grid-box results were also segregated according 
to whether the grid boxes contain mostly land, mostly ocean, or a mixture, but the
differences in the statistics for these subsets were, for the most part, difficult to discern. 
They will be discussed later. 

The coverage provided by the two satellites can vary from grid box to grid box
and month to month. To gauge this, let us define $S_{10}$ and $S_{11}$ as the effective numbers
of full viewings of a grid box by the F10 and F11, respectively, as measured by (2.7). To
compute $S_{10}$ and $S_{11}$, a method is needed for estimating the areal fraction $A_i/A$ for each
satellite visit $i$.

(i) Estimation of $S_{10}$ and $S_{11}$.

If the number of footprints required to cover the entire area $A$ is known, the ratio
of the actual number of footprints in $A$ to the full-coverage number provides an estimate
of the fraction $A_i/A$ for that particular visit. A possible method of determining the
full-coverage footprint number is to examine the distribution of the number of footprints
observed in many overflights of a grid box. Since the SSM/I swath is wide compared to
the grid-box size, we would expect a histogram of the number of footprints observed in
a box to peak at the maximum possible number. In reality, such histograms are not so
simply behaved. This is in part because the density of footprints varies with location
in the instrument swath, being largest near the swath's edges. Sporadic data loss due
to instrumental and algorithmic problems can also occur. As a result, the histogram of
footprint counts displays a somewhat broadened peak at the largest footprint counts.
Although a more exact method of determining the fractions $A_i/A$ could certainly be
devised, it is sufficient for our purposes to define the full-coverage footprint count as the
number of counts $N_{\max}$ where the histogram peaks. We estimate the fraction $A_i/A$ for
a given visit to a grid box to be the ratio of the actual footprint count to $N_{\max}$. This
estimate can sometimes be greater or less than 1 even though the swath completely
covers the grid box, but the monthly sums $S_{10}$ and $S_{11}$ that result from this choice
are reasonably good approximations to the values that would be obtained from more
geometric estimates, and in addition take account of occasional data dropouts. For
the SSM/I dataset we found $N_{\max} \approx 120$. Values of $S_{10}$ and $S_{11}$ computed this way
for the 512 cases ranged between 15 and 34, with a mean value of about 28, indicating
considerable variations in the satellite sampling. (It should be noted that the number of
days available in the months also varies.)
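
The counting procedure just described can be sketched in a few lines of Python. This is only an illustration: it assumes that (2.7), which is not reproduced in this section, amounts to summing the areal fractions over the visits in a month, and the footprint counts are invented. With real data the peak of the histogram is broadened, so the counts might first need to be grouped into coarser classes.

```python
from collections import Counter

def full_coverage_count(footprint_counts):
    """Return N_max, the footprint count at which the histogram of counts peaks."""
    histogram = Counter(footprint_counts)
    # Ties are broken in favour of the larger count (an arbitrary choice made here).
    return max(histogram, key=lambda n: (histogram[n], n))

def effective_viewings(monthly_counts, n_max):
    """Sum the estimated areal fractions A_i/A = count_i / N_max over one month."""
    return sum(count / n_max for count in monthly_counts)

# Hypothetical footprint counts for ten visits to one grid box (not real SSM/I data).
all_visits = [120, 120, 118, 120, 60, 121, 120, 30, 119, 120]
n_max = full_coverage_count(all_visits)
s_month = effective_viewings(all_visits, n_max)
print(n_max, round(s_month, 2))
```

Partial overflights (the counts 60 and 30 above) contribute fractions of a viewing, and a count slightly above the peak value contributes slightly more than one.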

(ii) Removing effects of variable coverage. 

Since our chief concern here is with how well (2.6) predicts the dependence of
$\sigma_E$ on local rain rate, it would be preferable if we could minimize the effects on our
analysis of the varying coverage by the satellites. Arguments very similar to those used
in deriving (2.6) predict

$$\langle (R_{10} - R_{11})^2 \rangle = C^2 \frac{R}{A} \left( \frac{1}{S_{10}} + \frac{1}{S_{11}} \right), \tag{3.2}$$

where $S_{10}$ and $S_{11}$ are the effective numbers of full viewings of a grid box by the F10
and F11, respectively, as measured by (2.7). By defining a “mean” coverage $S$ for the
two satellites by

$$2/S = 1/S_{10} + 1/S_{11} , \tag{3.3}$$

we can recast Eq. (3.2) in a form identical to Eq. (2.6) even if the relative coverage
by the two satellites varies. As in Eq. (2.6), the coefficient $C$ in (3.2) may depend
on local rain statistics in ways suggested by Eqs. (2.8) or (2.16), but it should be
relatively insensitive to changes in coverages $S_{10}$ or $S_{11}$. (It should be noted that the
rain statistics determining $C$ are now those of the “measured” rain, including the effects
of randomly varying retrieval error.)

Consider the result of multiplying Eq. (3.2) by $S/2$,

$$S \langle (R_{10} - R_{11})^2 \rangle / 2 = C^2 R/A . \tag{3.4}$$

The ensemble average $\langle \cdot \rangle$ here indicates an average over many different sequences of rain
events all having the same monthly mean $R$ and observed by the two satellites. Since
changes in $S$ have relatively little effect on the right-hand side of (3.4), the left-hand
side will be insensitive to changes in $S$ as well. This allows us to obtain estimates of the
right-hand side of Eq. (3.4) from averages of data with differing values of $S$, so that we
can write

$$C^2 R/A = \langle S (R_{10} - R_{11})^2 \rangle / 2 , \tag{3.5}$$

where now the angular brackets are meant to indicate an average over an ensemble
of months with varying rain sequences with monthly average $R$ and varying satellite
sampling as measured by $S$.
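
As a small illustration of Eqs. (3.3) and (3.5), the sketch below computes the mean coverage and the single-sample quantity whose average appears on the right-hand side of (3.5). The function names and the numerical values are hypothetical and chosen only for the example.

```python
def mean_coverage(s10, s11):
    """Eq. (3.3): 2/S = 1/S10 + 1/S11, so S is the harmonic mean of the coverages."""
    return 2.0 / (1.0 / s10 + 1.0 / s11)

def coverage_normalized_sq_diff(r10, r11, s10, s11):
    """One sample of S (R10 - R11)^2 / 2, the quantity averaged in Eq. (3.5)."""
    return mean_coverage(s10, s11) * (r10 - r11) ** 2 / 2.0

# Hypothetical monthly means (mm/h) and coverages for one grid box and month.
s = mean_coverage(20.0, 30.0)
x = coverage_normalized_sq_diff(0.25, 0.15, 20.0, 30.0)
print(s, x)
```

Because $S$ is a harmonic mean, it is pulled toward the smaller of the two coverages, which is the satellite that contributes more to the variance of the difference.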

(iii) Dependence of RMS error on R. 

Guided by Eq. (3.5), then, we investigate the dependence of $\sigma_E$ on rain rate by
first computing the mean rain rate

$$R = (R_{10} + R_{11})/2 \tag{3.6}$$

for each of the 512 grid boxes and months. The 512 pairs of estimates from the F10
and F11 are sorted into 8 bins in order of increasing values of $R$, with 64 samples to
a bin. For each bin, an average over the 64 values of $S (R_{10} - R_{11})^2/2$ gives us an
estimate of $C^2 R/A$, using (3.5), at the mean $R$ for that bin. The binning process
destroys information regarding the geographical location of a particular box and the
observation month: samples containing similar monthly averaged rain rates are lumped
together regardless of their location or time of observation. Although rain statistics no
doubt change as various factors affecting the formation and development of precipitating
systems within each grid box change, the operating assumption is the same as that of
the simple model: that if the frequency with which rain events occur in a grid box is
known, all other rain statistics at that location can be predicted reasonably well.
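
The binning step can be sketched as follows. The helper name and the synthetic input are assumptions made for illustration; each input pair holds the mean rain rate from (3.6) and the corresponding sample of $S (R_{10} - R_{11})^2/2$.

```python
def bin_by_rain_rate(samples, n_bins):
    """Sort (R, x) pairs by R and average each equal-population bin.

    x stands for one sample of S (R10 - R11)^2 / 2. Samples left over when
    len(samples) is not a multiple of n_bins are dropped.
    """
    ordered = sorted(samples, key=lambda pair: pair[0])
    size = len(ordered) // n_bins  # 512 // 8 = 64 for the case in the text
    result = []
    for b in range(n_bins):
        chunk = ordered[b * size:(b + 1) * size]
        mean_r = sum(r for r, _ in chunk) / len(chunk)
        mean_x = sum(x for _, x in chunk) / len(chunk)
        result.append((mean_r, mean_x))
    return result

# Synthetic (R, x) pairs in arbitrary order, purely for demonstration.
samples = [(0.1 * i, 0.01 * i) for i in range(16, 0, -1)]
bins = bin_by_rain_rate(samples, n_bins=4)
print(bins[0])
```

Each returned pair is an estimate of $C^2 R/A$ at the mean rain rate of its bin, with no record kept of where or when the contributing samples were observed.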

It was mentioned in section 2 that sampling errors for monthly averaged TRMM
TMI data have been estimated with simulations using ground-based measurements in a
variety of rain environments. Near the equator the TMI and a DMSP satellite carrying
SSM/I provide almost identical coverage, as measured by $S$, if both instruments are
providing rain estimates from the entire instrument swaths during the month. With
perfect coverage, $S \approx 30$ for both satellites. In order to compare our SSM/I results to
these earlier TRMM studies, Eq. (2.6) and our estimates of $C^2 R/A$ from (3.5) can be
used to compute what the random error in monthly averages of SSM/I data would
be for the same coverage $S_0 = 30$ assumed in the TRMM studies, via

$$\sigma_E^2 = \frac{\langle S (R_{10} - R_{11})^2 \rangle / 2}{S_0} .$$

Figure 1 shows a plot of $\sigma_E/R$ estimated for a single SSM/I providing maximum
possible coverage during a month (i.e., assuming an average of 30 visits per month).
Results are plotted versus the average $R$ for each bin. Error bars are 95% confidence
limits obtained under the assumption that differences in monthly means behave
statistically like independent, normally distributed variables.
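
A minimal sketch of the rescaling to the reference coverage $S_0 = 30$ follows. The bin averages used in the example are invented, and the confidence limits mentioned above are not computed here.

```python
import math

def relative_error(mean_x, mean_r, s0=30.0):
    """Return sigma_E / R for coverage s0.

    mean_x is the bin average of S (R10 - R11)^2 / 2 and mean_r the bin-mean
    rain rate, so that sigma_E^2 = mean_x / s0.
    """
    return math.sqrt(mean_x / s0) / mean_r

# Hypothetical bin averages: mean_x in (mm/h)^2, mean_r in mm/h.
ratio = relative_error(0.12, 0.2)
print(round(ratio, 3))
```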

Also shown in Fig. 1 are sampling-error estimates based on two radar datasets 
collected from ships stationed over open ocean. The two estimates labeled “GATE” 
use the statistics of data taken over the eastern tropical Atlantic during Phases I and 
II of the Global Atmospheric Research Program Atlantic Tropical Experiment (GATE)
in 1974. The six estimates labeled “TOGA COARE” use the statistics of radar data 
from two ships during the three cruises of the SOP. The methods used in obtaining 
these estimates are described fully in BK00. Comparison of the SSM/I estimates with 
the TOGA COARE estimates is particularly appropriate because the data were taken 
during the same four months, although the radar data cover only a limited region 
around 2°S, 156°E.

Figure 1. Relative sampling error of monthly grid-box averages over the equatorial
western Pacific as a function of mean local rain rate $R$. SSM/I estimates have been
corrected for missing data. A power-law fit is shown. Estimates using surface radar
data assume coverage identical to what is provided by the TRMM microwave instrument,
averaging 30 visits per month, very close to the SSM/I sampling. GATE radar data were
taken during 1974. TOGA COARE radar data were taken contemporaneously with the
SSM/I data.


Figure 1 brings out two salient characteristics of the SSM/I error estimates: 1)
Estimated errors in SSM/I averages, which may include random retrieval errors, are
30% or more of monthly mean rain rates, and considerably larger than previous error
estimates based on surface radar data, which are nominally estimates of sampling error
alone (but could include the effects of errors in the radar-derived rain rates); and 2)
even though both the simple model and experience (though admittedly limited) with
ground-based data suggest that $\sigma_E/R$ might be described by a power law with exponent
$-1/2$, the SSM/I errors are better described by a power law with an exponent of about
$-0.3$.
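
An exponent of this kind can be estimated with a straight-line fit in log-log coordinates. The sketch below is an unweighted version, whereas section 5 mentions error-weighted fits, and the data are synthetic values constructed to follow an exact power law.

```python
import math

def fit_power_law(r_values, y_values):
    """Unweighted least-squares fit of y = c * r**delta in log-log space."""
    lx = [math.log(r) for r in r_values]
    ly = [math.log(y) for y in y_values]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    delta = sxy / sxx
    c = math.exp(my - delta * mx)
    return c, delta

# Synthetic relative errors following 0.2 * R**-0.3 exactly (not measured values).
rates = [0.05, 0.1, 0.2, 0.4, 0.8]
errors = [0.2 * r ** -0.3 for r in rates]
c, delta = fit_power_law(rates, errors)
print(round(c, 3), round(delta, 3))
```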

It should be noted that a number of sampling error estimates have been made 
with ground-based data other than those shown in Fig. 1. They are reviewed by BK00.
Two extensive studies, by Oki and Sumi (1994) and by Steiner (1996), yielded sampling-error
estimates that are comparable in magnitude to the SSM/I values in Fig. 1, except
at the highest rain rates, where the SSM/I estimates are larger. Because these studies
used data from southern coastal Japan and from Darwin, on the northern coast of
Australia, however, it is not clear that comparison with the SSM/I results is appropriate
here. Rain in tropical coastal areas is quite different in character from rain over the
open ocean. The SSM/I statistics we used are largely determined by rain over oceanic
areas. The TOGA COARE radar statistics shown in Fig. 1 are from an area and time
period included in the SSM/I dataset, and so would be most nearly comparable.

It is interesting to note that Chang et al. (1993) also obtained rms error as a
function of the mean rain rate on a 5° x 5° grid, using global oceanic monthly estimates
of rainfall obtained with their microwave emission-based algorithm. If their results are
converted to the format used here, they can be fitted to $\sigma_E/R \approx 0.26\,R^{-0.26}$ ($R$ in
mm h$^{-1}$). The relative errors they found are roughly 50% higher than the corresponding
errors for 5° x 5° boxes we found (not shown) using the SSM/I dataset studied here.
We conjecture that, because the grid boxes in Chang et al.'s (1993) study were all 5° x
5° regardless of location, boxes at higher latitudes that contributed to their statistics
had smaller physical areas, and Eq. (2.6) predicts that they would have higher rms
errors than for boxes near the equator. Thus, the higher errors of extratropical grid
boxes may have been averaged with the errors for tropical grid boxes and resulted in an
overall increase in average error, whereas our analysis covers only equatorial areas.

Figure 1 has shown that, where they can be compared, the statistics of the 
microwave-retrieved rain rates clearly differ in important ways from the statistics 
of surface radar data. In the sections that follow we shall try to identify where the 
differences occur, propose some useful diagnostics for these differences, and suggest how 
Eq. (2.6) might be modified to take them into account.

4. Exploration of ground-radar-SSM/I differences

The assumptions of the simple model in section 2 lead to predictions for sampling
error like Eq. (2.9), where mean squared error is the product of the variance of
area-averaged rain rate, $\sigma_A^2$, and a factor $f(\Delta t/2\tau_A)/S$ determined by the temporal sampling
pattern of the satellite and by the correlation time $\tau_A$ of area-averaged rain rate. We
can rewrite it somewhat schematically as

$$\sigma_E^2 \approx \sigma_A^2 \, f(T/2\tau_A S)/S . \tag{4.1}$$

In reality, when satellite visits are not evenly spaced and the area $A$ is not viewed in its
entirety on each visit, the dependence of $f/S$ on a satellite's sampling pattern is more
complicated than the simple dependence on $S$ in (4.1) suggests. Based on an earlier
study (BK96) with TRMM sampling, however, Eq. (4.1) seems to capture much of the
change in sampling error with satellite sampling.

As we shall see later, the correlation times of SSM/I-retrieved rain rates tend
to be similar in size to the correlation times seen in radar data and small compared to
the typical time interval between SSM/I visits. We therefore conclude that the factor $f$
cannot explain the differences in sampling errors in Fig. 1. Most of the difference seems
to be due to differences in variability of area-averaged rain rate as reported by satellite
and ground-based systems, and we turn now to investigating the differences in $\sigma_A^2$ for
the two.

Figure 2. Ratio of the variance of instantaneous area-averaged rain rate $R_A(t)$ to $R$,
$A$ = 2.5° x 2.5°, computed following the procedures used for Fig. 1. The simple model
predicts that this quantity should be insensitive to local rain rate. Error bars (95% conf.)
are shown only for GATE, but others would have similar errors. A power-law fit to the
SSM/I points is shown. Corresponding statistics derived from TRMM TMI data are also
plotted, and are discussed in Sec. 6.


By combining Eqs. (2.6) and (2.9) it is easy to show that the simple model
predicts that $\sigma_A^2$ should increase linearly with $R$, so that the ratio $\sigma_A^2/R$ should remain
constant with changing local rain rates. Fig. 2 shows this quantity plotted as a function
of $R$ using the same binning procedure as in Fig. 1. In order to improve the legibility
of the figure, only error bars (95% confidence intervals) for the ratio computed from
GATE radar data are shown. They are representative of the estimated errors in the
other plotted quantities. (Also shown are corresponding values obtained from TRMM
TMI retrievals. These will be discussed later.) Given the level of uncertainty, it could
be argued that the surface radar statistics are consistent with the constancy with
$R$ predicted by the simple model, though synoptic conditions at the two radar sites
are sufficiently different that some underlying changes in the statistics may also be
occurring. Whether or not this is so, it is evident from Fig. 2 that variances in SSM/I
area averages are significantly larger than for the same averages obtained with surface
radar, and they also appear to increase faster with $R$ than the surface data.

Figure 3. The scale $\Lambda$ of “statistically independent rain events,” for SSM/I data in 2.5°
x 2.5° grid boxes, from Eq. (2.13). If spatial correlations decreased exponentially as
$\exp(-z/\lambda)$ with separation $z$ and the dimensions of $A$ are large compared to $\lambda$, then
$\lambda = \Lambda/\sqrt{2\pi}$. See appendix for details.

Equation (2.12) indicates that $\sigma_A^2$ is determined by the variance of the individual
SSM/I “point” estimates of rain rate (i.e., $s^2$ for FOV estimates) and by $\Lambda^2$, the area
of statistically independent rain events. Figure 3 shows the dependence of $\Lambda$ on $R$,
calculated using Eq. (2.13). The calculation of $\Lambda$ had to be adapted to handle the
actual spatial distribution of SSM/I footprints, and is described in the appendix. In
this and the plots that follow, the statistics for each value of $R$ are averages over 64
grid-box/months with monthly means in the neighborhood of $R$, just as in Figs. 1 and
2. SSM/I estimates for regions with monthly rain rates similar to those observed by
the surface radar in TOGA COARE, $R \approx 0.2$ mm h$^{-1}$, yield values of $\Lambda \approx 100$
km (corresponding to a “correlation distance” of about 40 km; see appendix). If the
TOGA COARE radar data are smoothed to a spatial resolution corresponding to the
scale of the SSM/I footprint area, about $\sqrt{\pi (28/2)^2}$ km $\approx 25$ km, and used to calculate $\Lambda$, a
value of $\Lambda$ very close to the SSM/I value is obtained. It is therefore the larger values of
$s^2$ for the SSM/I rather than differences in $\Lambda$ that are mostly responsible for the larger
values of $\sigma_A^2$ seen in Fig. 2.
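
The conversion between the scale of independent rain events and the correlation distance quoted above can be checked numerically. The sketch assumes the relation given in the caption of Fig. 3, which holds for exponentially decaying spatial correlations in a box much larger than the correlation distance; the function name is hypothetical.

```python
import math

def correlation_distance(event_scale_km):
    """Convert the independent-event scale to an e-folding correlation distance.

    Uses correlation distance = scale / sqrt(2 pi), valid for correlations
    of the form exp(-z / distance) in a large box.
    """
    return event_scale_km / math.sqrt(2.0 * math.pi)

distance_km = correlation_distance(100.0)
print(round(distance_km, 1))
```

For a scale of 100 km this gives a value close to the 40 km mentioned in the text.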



Figure 4. Mean $r_c$ and standard deviation $s_c$ of SSM/I rain rates in FOVs with nonzero
rain, and the ratio $\mu_c = s_c/r_c$.

Equation (2.15) relates values of $s^2$ to the average areal coverage by rain, $p$, and
the mean and variance of nonzero rain rates, $r_c$ and $s_c^2$. Figure 4 shows the conditional
mean $r_c = R/p$ and standard deviation $s_c$ of nonzero rain seen by SSM/I, and also
the ratio $\mu_c = s_c/r_c$, as a function of $R$. The statistics are comparable in size to
those reported for GATE data by Short et al. (1993), especially $\mu_c$. The ratio $\mu_c$ is
nearly constant, a phenomenon also noted by Short et al. (1993) in other rain data.
There are, however, subtle threshold-dependent effects in the conditional statistics
that make intercomparison of the radar and SSM/I statistics problematic. The radar
is able to detect much smaller rain rates than the SSM/I. When values of $r_c$, $s_c$, and
$\mu_c$ are calculated from surface TOGA COARE radar data smoothed to a spatial
resolution corresponding to the scale of an SSM/I FOV [$\approx (25\ \mathrm{km})^2$], we find values
$r_c = 0.5$ mm h$^{-1}$, $s_c = 1.4$ mm h$^{-1}$, and $\mu_c = 2.7 \pm 0.3$. They are quite different from
the satellite values. For example, we see in Fig. 4 that for the SSM/I data $\mu_c$ ranges
between 1.21 and 1.44. The difference in the values of $\mu_c$ obtained by us from TOGA
COARE radar data and the values obtained from SSM/I data and in the analyses of
surface data by others suggests that $\mu_c$ may depend on the threshold of detectability of
rain in a way that was fortuitously absent in other studies.
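
The conditional statistics and their sensitivity to the detection threshold can be illustrated with a short sketch. The rain rates are invented, the use of the population standard deviation is an assumption, and the `threshold` argument is a hypothetical stand-in for the smallest rain rate an instrument can detect.

```python
import statistics

def conditional_stats(fov_rain_rates, threshold=0.0):
    """Return (p, r_c, s_c, mu_c) for FOVs whose rain rate exceeds threshold."""
    raining = [r for r in fov_rain_rates if r > threshold]
    p = len(raining) / len(fov_rain_rates)   # fraction of FOVs with detected rain
    r_c = statistics.mean(raining)           # conditional mean rain rate
    s_c = statistics.pstdev(raining)         # conditional standard deviation
    return p, r_c, s_c, s_c / r_c

# Invented FOV rain rates in mm/h: six dry footprints and two raining ones.
fov = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 3.0]
p, r_c, s_c, mu_c = conditional_stats(fov)
p_hi, r_c_hi, s_c_hi, mu_c_hi = conditional_stats(fov, threshold=1.5)
print(p, r_c, s_c, mu_c)
```

Raising the threshold removes the lighter rain from the conditional sample, which lowers $p$, raises $r_c$, and changes the ratio $s_c/r_c$, so statistics from instruments with different sensitivities are not directly comparable.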



Figure 5. Autocorrelation of SSM/I rain rate averaged over 2.5° x 2.5° grid boxes for
various categories of monthly rain rate $R$. Correlations are shown only when more than
about 400 pairs of observations are available at a given separation $\tau$. Curves through
data points are smoothed interpolations.


In order to study temporal correlations of area-averaged SSM/I rain estimates, a
time series of the average rain rate for full-area observations at each grid-box location
was obtained. All visits with greater than about 85% coverage, determined from
the footprint counts as explained in section 3.c.i, were included to get a time series
that is sufficiently dense. Because the visit times of the F10 and F11 sometimes
differed by as little as 3 h, these series had sufficient time resolution for useful time
correlations to be obtained. Figure 5 shows the lagged autocorrelations of $R_A(t)$ sorted
into the same 8 climatological rain-rate bins used in the previous figures; that is,
autocorrelations for a given $R$ represent the statistics of 64 time series with monthly
means in the neighborhood of $R$. For each of the 8 rain-rate categories we fitted the
lagged autocorrelation function of the area-averaged rain rate to a simple exponential
form $\exp(-|t - t'|/\tau_A)$. The correlation times $\tau_A$ were found to be about 6 hours and
nearly independent of $R$, except at the lowest and highest rain rates. Spectral analysis
of the time series indicated enhanced spectral power at frequencies corresponding
to periods of 2-5 days and 40-50 days. The former may possibly be related to the
convective disturbances with that time scale discussed by Takayabu and Nitta (1993),
while the latter may be related to the Madden-Julian oscillation (Madden and Julian
1972; Chen and Yanai 2000).
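
One simple way to extract a correlation time from lagged autocorrelations is a straight-line fit to the logarithm of the correlation, constrained to pass through the origin. The fitting procedure actually used is not specified in the text, so the sketch below is only one plausible choice, and the input values are synthetic.

```python
import math

def fit_correlation_time(lags_h, correlations):
    """Fit ln(rho) = -lag / tau through the origin and return tau in hours.

    Non-positive correlations cannot be log-transformed and are skipped.
    """
    pairs = [(t, math.log(c)) for t, c in zip(lags_h, correlations) if c > 0.0]
    slope = sum(t * y for t, y in pairs) / sum(t * t for t, _ in pairs)
    return -1.0 / slope

# Synthetic autocorrelations generated with a 6 h correlation time.
lags = [3.0, 6.0, 9.0, 12.0]
rho = [math.exp(-t / 6.0) for t in lags]
tau = fit_correlation_time(lags, rho)
print(round(tau, 2))
```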

It is well known that the statistical behavior of rainfall differs over land and
ocean. To investigate this quantitatively, we employed a land/ocean mask at 2.5° spatial
resolution. Of the 128 grid boxes in the chosen area, 97 are categorized as covered
by ocean, 23 as mostly covered by land (largely concentrated around New Guinea
in the southwest quadrant of the area we studied), and 8 as containing substantial
amounts of both. The statistics of land-containing grid boxes were sorted into only 4
bins with increasing rain rates $R$ in order to have a reasonable number of samples in
each bin. Monthly rain rates in the land-containing boxes tended to range over values
less than half as large as for the ocean-covered boxes. Most land-ocean differences in
the statistics were indistinguishable from variability caused by small-sample effects. The
conditional means $r_c$, however, were 50% to 75% larger over land, unlike the values of
$s_c$, which were, perhaps surprisingly, a little smaller. The ratio $\mu_c$ ranged from 1.43
to 1.56 over ocean and from 0.85 to 1.0 over land. A pronounced peak in spectral
power was found in the time spectrum of rain over land-covered boxes at a frequency
of 1 day$^{-1}$, indicating the presence of a strong diurnal cycle. No spectral peak was
evident in oceanic rain rates at that frequency. There is also little sign of any enhanced
autocorrelation at $\tau = 24$ h in Fig. 5, except perhaps for grid boxes with the smallest
rain rates, indicating that the overall statistics tended to be dominated by the statistics of the
oceanic grid boxes.

5. Power-law descriptions of SSM/I statistics 

The statistics of the SSM/I retrieved rain rates are described quite well by simple
power-law dependences on $R$, as can be seen from the power-law fits shown in Figs. 1-4.
Since this provides a much more concise description of the statistics, we present these 
results here. 

It is convenient to express the various statistical quantities as powers of the
dimensionless quantity $p$ rather than $R$. We introduce the three basic exponents $\alpha$, $\beta$,
and $\gamma$ through the relations

$$r_c = r_0 p^{\alpha}, \qquad s_c^2 = s_0^2 p^{\beta}, \qquad \Lambda^2 = \Lambda_0^2 p^{\gamma} . \tag{5.1}$$

Note that in the simple model all the exponents would vanish. From the definition
$R = p r_c$ it follows that

$$R = r_0 p^{1+\alpha} , \tag{5.2}$$

and if we treat the ratio $\mu_c$ as approximately constant, Eq. (2.15) gives

$$s^2 \approx (s_0^2 + r_0^2)\, p^{1+\beta} . \tag{5.3}$$

(Strict constancy of $\mu_c$ would imply $\beta = 2\alpha$.)

The expression (2.12) for $\sigma_A^2$ implies the power-law relation

$$\sigma_A^2 = (s_0^2 + r_0^2)(\Lambda_0^2/A)\, p^{1+\beta+\gamma} . \tag{5.4}$$

Because $p$ and $R$ are related by (5.2), the exponents $\alpha$, $\beta$, and $\gamma$ can be derived from
the exponents obtained with error-weighted least-squares power-law fits to the statistics
in Figs. 2-4. We find that the SSM/I statistics can be reasonably well explained by the
values

$$\alpha = 0.17, \qquad \beta = 0.53, \qquad \gamma = 0.53 . \tag{5.5}$$

The coefficients of the power-law fits in Eqs. (5.1) were found to be

$$r_0 = 4.5\ \mathrm{mm\,h^{-1}}, \qquad s_0 = 7.6\ \mathrm{mm\,h^{-1}}, \qquad \Lambda_0 = 220\ \mathrm{km} . \tag{5.6}$$
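
As a consistency check, the fitted power laws can be evaluated at a given monthly rain rate. The sketch assumes the fits have the form $r_c = r_0 p^{\alpha}$, $s_c^2 = s_0^2 p^{\beta}$, and $\Lambda^2 = \Lambda_0^2 p^{\gamma}$, with the parameter values of (5.5) and (5.6); the function names are hypothetical.

```python
R0, S0, LAMBDA0 = 4.5, 7.6, 220.0      # r_0 [mm/h], s_0 [mm/h], Lambda_0 [km], Eq. (5.6)
ALPHA, BETA, GAMMA = 0.17, 0.53, 0.53  # exponents from Eq. (5.5)

def rain_fraction(rate):
    """Invert Eq. (5.2), R = r_0 p**(1 + alpha), for the rain coverage p."""
    return (rate / R0) ** (1.0 / (1.0 + ALPHA))

def fitted_statistics(rate):
    """Return (p, r_c, s_c, Lambda) from the assumed power-law forms."""
    p = rain_fraction(rate)
    r_c = R0 * p ** ALPHA
    s_c = S0 * p ** (BETA / 2.0)           # s_c^2 = s_0^2 p**beta
    event_scale = LAMBDA0 * p ** (GAMMA / 2.0)  # Lambda^2 = Lambda_0^2 p**gamma
    return p, r_c, s_c, event_scale

p, r_c, s_c, event_scale = fitted_statistics(0.2)  # R = 0.2 mm/h
print(round(p, 3), round(r_c, 2), round(s_c, 2), round(event_scale, 1))
```

At $R = 0.2$ mm h$^{-1}$ the fits give an event scale a little over 100 km and a ratio $s_c/r_c$ within the range of SSM/I values quoted in section 4.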


Although not shown here, it was found that the right-hand side of Eq. (3.5) is
proportional to $\sigma_A^2$ to reasonably good accuracy; that is, the dependence of $\sigma_E$ on $R$ is
mostly determined by the $R$-dependence of $\sigma_A$, as predicted in (4.1). Their relationship
is described empirically by

$$\sigma_E^2 = 0.66\, \sigma_A^2 / S . \tag{5.7}$$


It is interesting to compare the empirical coefficient in (5.7) with what would be
estimated from Eqs. (2.9) and (2.11). If we use the correlation time $\tau_A = 6$ h found
in section 4 for most rain rates $R$, and the mean monthly areal coverage $S = 28$, we
calculate $f(T/2\tau_A S) = 0.56$. Given the crude nature of the estimate, which assumes
exponential autocorrelation of $R_A(t)$ and equally spaced observations in time by the
satellite, the extent of agreement with the observed value 0.66 is remarkable.
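
The value 0.56 can be reproduced under two assumptions that are not stated in this section: that the factor has the form $f(x) = \coth x - 1/x$, as is obtained for evenly spaced sampling of a process with exponential autocorrelation, and that the averaging period $T$ is 30 days. Eq. (2.11) is not reproduced here, so the sketch should be read as a plausibility check rather than as the definition of $f$.

```python
import math

def sampling_factor(period_h, tau_h, n_visits):
    """Assumed form f(x) = coth(x) - 1/x with x = T / (2 tau_A S)."""
    x = period_h / (2.0 * tau_h * n_visits)
    return 1.0 / math.tanh(x) - 1.0 / x

# T = 30 days (assumed), tau_A = 6 h, S = 28 visits.
f_value = sampling_factor(30 * 24.0, 6.0, 28.0)
print(round(f_value, 2))
```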

If the relatively small effects on sampling error $\sigma_E$ due to changes in $\tau_A$ with $R$
are neglected, Eq. (4.1) implies

$$\sigma_E^2 \propto p^{1+\beta+\gamma} . \tag{5.8}$$


The relative sampling error for a single SSM/I satellite shown in Fig. 1, when fitted to a
power law in $R$,

$$\sigma_E/R \propto R^{\delta} , \tag{5.9}$$


gives an exponent $\delta = -0.30$ (instead of the $-0.5$ predicted by the simple model). The
power laws (5.1) would predict $\delta = -1/2 + (\beta + \gamma - \alpha)/[2(1 + \alpha)]$, or $\delta = -0.12$
when the exponents in (5.5) are substituted. The discrepancy between the exponent obtained
by directly fitting $\sigma_E/R$ to a power law and the exponent predicted using the other
empirical exponents appears to be due to the changes in the correlation time of $R_A(t)$ at
the smallest and largest rain rates $R$ seen in Fig. 5. The resulting changes in the factor
$f$ in (4.1) are equivalent to a shift in the predicted value of $\delta$ from $-0.12$ to a value
near $-0.3$.
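
The predicted exponent follows from combining (5.8) with (5.2), and the arithmetic can be verified directly. The function name is hypothetical.

```python
def predicted_delta(alpha, beta, gamma):
    """Exponent of sigma_E / R versus R implied by Eqs. (5.2) and (5.8)."""
    return -0.5 + (beta + gamma - alpha) / (2.0 * (1.0 + alpha))

delta_ssmi = predicted_delta(0.17, 0.53, 0.53)   # exponents from Eq. (5.5)
delta_simple = predicted_delta(0.0, 0.0, 0.0)    # simple model: all exponents vanish
print(round(delta_ssmi, 2), delta_simple)
```

With all three exponents set to zero the expression reduces to the simple-model value of $-1/2$.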

6. Some Preliminary TRMM Results 

The analysis so far described was motivated in part by the need to supply a 
measure of the random error for gridded monthly rain-rate products produced by 
TRMM. From a rainfall-retrieval-algorithm point of view, the TRMM’s TMI has an
advantage over the SSM/I because the TRMM satellite orbits closer to the earth, giving
the instruments improved spatial resolution, and the TMI includes a lower-frequency
dual-polarization 10.7-GHz channel in addition to SSM/I’s four higher-frequency
channels. Although the random error in TRMM monthly rain climatologies will be more 
thoroughly explored in a subsequent paper, it is interesting to compare the performance 
of TRMM to what has been learned about SSM/I here. 

a. TRMM Data 

We used TMI surface rainfall retrievals made available by the Goddard Space 
Flight Center (GSFC) Distributed Active Archive Center (DAAC) as official TRMM 
product 2A12, version 4, for the four-month period January-April 1998 over the same 
geographical area as the one used in the SSM/I study here. The TMI rain product has 
benefited not only from the instrumental advantages mentioned above, but also from the 
use of a version of the algorithm more advanced than the one used with the SSM/I data. 
The most important change in the algorithm is probably the addition of a step which 
adjusts for the relative amounts of convective and stratiform rain present in each FOV, 
as described by Hong et al. (1999). 

b. Data Analysis Results 

The dependence of the statistics of TMI rain-rate data on local rain rate $R$ was
determined in the same manner as before, by binning the statistics for each 2.5° x 2.5°
grid box and month according to the monthly mean $R$. A plot of $\sigma_A^2/R$ for TRMM
is shown in Fig. 2. The number of bins was increased to 16 when it became apparent
that the statistics change in character above and below $R \approx 0.1$ mm h$^{-1}$, so that each
point represents an average of 32 rather than 64 grid-box results. It is encouraging to
see that the TRMM statistic has moved closer to the radar values. The improvement is
especially marked at the higher rain rates, where the ratio is both more nearly constant
with $R$ and considerably lower than the SSM/I results.

Rather good fits of the TRMM results to power laws in $R$ can be obtained if
a fairly sharp crossover of the exponent values for rain rates above and below $R =
0.1$ mm h$^{-1}$ is allowed. The parameters of the fits in both regimes are given in Table
1. The parameters for the conditional rain statistics for TRMM are very different from
those of the SSM/I statistics given in (5.5) and (5.6).

7. Summary and Conclusions 

SSM/I rain-rate data taken during the TOGA COARE experiment were used
to estimate the random error in monthly averages over 2.5° grid boxes in the western
tropical Pacific. The satellite algorithm that was used is a predecessor of the one
currently used to process TRMM microwave data. The error estimates were made
in two different ways: one estimate was obtained from the rms differences of the monthly
averaged rain rates given by the F10 and F11 satellites; a second estimate was obtained
from the variance, $\sigma_A^2$, of instantaneous area-averaged rain rates $R_A(t)$, and a rough
estimate of the temporal correlations of $R_A(t)$. The two estimates agreed quite well.
This suggests that reasonable estimates of random error in gridded monthly averages
might be made from $\sigma_A^2$ and an approximate characterization of the time correlations
of $R_A(t)$, quantities that can be obtained from the satellite data themselves. Such
estimates will include the contributions of random retrieval errors to the total error.

Over the ocean, both the magnitude of $\sigma_A^2/R$ and its dependence on local rain
rate $R$ are clearly different for the SSM/I rain estimates and surface radar estimates.
The higher variance of SSM/I estimates of $R_A(t)$ compared to radar appears to be due
mostly to the larger variance of individual footprint estimates, measured by $s^2$, rather
than greater spatial correlations of the rain data, to the extent they are measured by
$\Lambda$. It will be shown in a separate paper that the SSM/I estimates are highly correlated
with stratiform rain as identified in the TOGA COARE surface radar data, and not
so well correlated with rain identified as convective; the SSM/I rain estimates where
there is stratiform rain are much larger than the corresponding radar estimates, whereas
rain estimates where the radar reports convective rain tend to be estimated smaller by
SSM/I. The net effect is to make $s^2$ large for SSM/I FOV estimates. These conclusions
apply, of course, only to the rain data generated by the particular algorithm used to
produce the dataset investigated here.

Little has been said here about how sampling error depends on the grid-box
area $A$. As was seen in Eq. (2.6), the simple model would predict $\sigma_E \propto A^{-1/2}$.
Equations (2.9) and (2.12), however, indicate that this is only true if the area $A$ is much
larger than $\Lambda^2$. The 2.5° x 2.5° boxes studied here are not quite large enough in this
respect. Although increasing the box size to 5° x 5° reduces the number of samples
per bin when the statistics are binned by rain rate $R$, as was done in section 3, such
an experiment shows that the power-law dependence of $\sigma_A^2$ on $R$ is almost the same for
the two box sizes, but that the dependence of $\sigma_A$ on $A$ is consistent with $\sigma_A \propto A^{-0.33}$
rather than with $A^{-1/2}$. Thus, increasing the box size from 2.5° to 5° does not decrease
sampling error as much as the simple model would have predicted if $A$ were larger.
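
The practical difference between the two scalings is easy to quantify. Doubling the side of a grid box multiplies its area by 4, and the sketch below compares the resulting reduction factors for the two exponents quoted above; the function name is hypothetical.

```python
def error_reduction(area_factor, exponent):
    """Factor by which an error proportional to A**exponent changes when A is scaled."""
    return area_factor ** exponent

observed = error_reduction(4.0, -0.33)      # exponent found for the SSM/I data
simple_model = error_reduction(4.0, -0.5)   # exponent of the simple model
print(round(observed, 2), simple_model)
```

Under the simple model the error would be halved, whereas the observed exponent implies a reduction of only a little more than one third.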

Based on our results, it is recommended that future algorithm intercomparison
projects include comparisons of $\sigma_A^2/R$ for grid-box sizes of the order of 2.5° or larger, in
addition to comparing the mean rain rates $R$ themselves. The ratio is easy to calculate
and, as has been shown here, can serve to bring out some aspects of the algorithms
that can be missed in point-by-point comparisons but are important for climatological
use of the data. This quantity has the advantage that, other things being equal, it is
not so sensitive to instrument resolution, and so makes intercomparison of different
measurement systems conceptually easier. The quantity $\sigma_A^2$ can reveal the presence of
correlated retrieval errors in the satellite product, a possible byproduct of the reason
for its being larger in the SSM/I data than in the radar data, as will be discussed in a
subsequent paper.

An especially important result is that the quantity $\sigma_A^2$ can be used to estimate
the accuracy of monthly averages of rain data via a relation like Eq. (5.7). Such an
estimate avoids some of the assumptions used in parameterizing error in terms of
average rain rate $R$, though it requires that the satellite dataset supply values of $\sigma_A^2$ as
well as $R$ for each grid box.

Whether because of better resolution and additional channels in the TMI or 
because of improvements in algorithms, the statistics of TRMM TMI (version 4) rain 
estimates from the western tropical Pacific appear to be significantly closer to oceanic 
surface radar statistics than the SSM/I statistics. An improved TMI algorithm is now 
being used to process TMI data, and we expect even better agreement with ground- 
based data. This will be examined in a future paper. 
APPENDIX

Computation of the Length Scale $\Lambda$

In this appendix we discuss in more detail the computation and interpretation
of $\Lambda^2$ defined in Eq. (2.13). It is helpful in developing an interpretation of $\Lambda$ to assume
that the footprints are sufficiently densely and evenly distributed that they can be
treated as if arranged in a regular rectangular array completely filling the area $A = L^2$.
Each footprint occupies a box of side $d = L/N$. The number of footprints is then
$N_0 = N^2$. The quantity $\Lambda$ so defined in general depends both on the area size $L$ and
the footprint size $d$. For instance, if the area is small enough to be covered by a single
footprint, then obviously $\Lambda = L$. More generally, however, $\Lambda$ is closely related to the
scale over which the data are spatially correlated, as we now show.

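Using the identity

$$\sum_{i=1}^{N}\sum_{j=1}^{N} f(i-j) = \sum_{m=-N}^{N} (N-|m|)\,f(m) \tag{A1}$$

for a function $f(i)$ defined at each integer $i$, $|i| \le N-1$, we can write (2.13) as

$$\Lambda^2 = \frac{A}{N_0^2}\sum_{i,j=1}^{N}\ \sum_{i',j'=1}^{N} \rho\left(|\mathbf{x}_{ij}-\mathbf{x}_{i'j'}|\right)
= \frac{A}{N^4}\sum_{m_1=-N}^{N}\ \sum_{m_2=-N}^{N} (N-|m_1|)(N-|m_2|)\,\rho(|\mathbf{m}|d) . \tag{A.2}$$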

This formally transforms the sum over the correlation between all pairs of footprints in
the $N \times N$ array into a weighted sum of the correlation between each footprint in an
equally spaced $(2N+1) \times (2N+1)$ array and a footprint located at the center of the
array. If $\rho(|\mathbf{m}|d)$ is sufficiently smooth, Eq. (A.2) can be treated as a discrete numerical
approximation to a continuous double integral. The approximation becomes exact in the
limit $d \to 0$ (“point footprint”). Introducing the separation vector $\mathbf{s} = \mathbf{m}d$, and using
the relations $A = L^2$ and $L = Nd$, we can express $\Lambda^2$ in this limit as an area integral
over a $2L \times 2L$ square:

$$\Lambda^2 \approx \frac{1}{A}\int_{-L}^{L} ds_1 \int_{-L}^{L} ds_2\,(L-|s_1|)(L-|s_2|)\,\rho(|\mathbf{s}|) .$$

By going to polar coordinates this can be reduced further to the one-dimensional
integral

$$\Lambda^2 = 4\int_0^{\sqrt{2}L} s\,g(s)\,\rho(s)\,ds \tag{A3}$$

with the angular integral replaced by the areal weighting factor

$$g(s) = \int_{\varphi_0(s)}^{\varphi_1(s)} d\varphi \left(1-\frac{s}{L}\cos\varphi\right)\left(1-\frac{s}{L}\sin\varphi\right), \tag{A4}$$

where

$$\varphi_0(s) = \begin{cases} 0, & s < L; \\ \cos^{-1}(L/s), & s > L \end{cases}$$

and

$$\varphi_1(s) = \pi/2 - \varphi_0(s) .$$

Carrying out the integrations in (A4) we get

$$g(s) = \begin{cases} \pi/2 - 2s/L + s^2/(2L^2), & 0 < s < L; \\ \pi/2 - 1 - 2\cos^{-1}(L/s) + 2\sqrt{s^2/L^2 - 1} - s^2/(2L^2), & L < s < \sqrt{2}L . \end{cases}$$
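
The closed form for g(s) can be checked numerically against the angular integral (A4). The following is only a minimal sketch: the function names `g_closed` and `g_numeric`, the midpoint rule, and the choice L = 1 are illustrative assumptions, not part of the original analysis.

```python
import math

def g_closed(s, L):
    """Closed-form areal weighting factor g(s) for a square box of side L."""
    a = s / L
    if a <= 1.0:
        return math.pi / 2 - 2 * a + a * a / 2
    if a >= math.sqrt(2.0):
        return 0.0  # no pair of points in the box is farther apart than sqrt(2) L
    return (math.pi / 2 - 1 - 2 * math.acos(1 / a)
            + 2 * math.sqrt(a * a - 1) - a * a / 2)

def g_numeric(s, L, n=20000):
    """Midpoint-rule evaluation of the angular integral (A4)."""
    a = s / L
    phi0 = 0.0 if a <= 1.0 else math.acos(min(1.0, 1 / a))
    phi1 = math.pi / 2 - phi0
    if phi1 <= phi0:
        return 0.0
    h = (phi1 - phi0) / n
    total = 0.0
    for k in range(n):
        phi = phi0 + (k + 0.5) * h
        total += (1 - a * math.cos(phi)) * (1 - a * math.sin(phi))
    return total * h

# Compare the two evaluations on both branches (s < L and L < s < sqrt(2) L).
for s in (0.0, 0.3, 1.0, 1.2, 1.4):
    print(s, g_closed(s, 1.0), g_numeric(s, 1.0))
```

The two branches of the closed form agree at s = L, where both give π/2 − 3/2, and the factor vanishes at s = √2 L, the largest separation possible inside the box.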

When the footprints are small compared with spatial correlation lengths and the
grid-box size A is large, one can easily show that

$$\Lambda^2 \approx 2\pi L_{\rm int}^2 , \tag{A5}$$

where

$$L_{\rm int}^2 = \int_0^{\infty} s\,\rho(s)\,ds \tag{A6}$$

defines an “integral correlation length” $L_{\rm int}$, which is just the usual correlation length, the (1/e)-folding
distance, if the correlation $\rho(s)$ decreases exponentially.
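
The large-box limit (A5) can be illustrated with an exponential correlation, for which the integral in (A6) equals the square of the e-folding length. The sketch below evaluates both the lattice sum (A.2) and the one-dimensional integral (A3) and compares them with 2π times the squared correlation length. Everything specific here is an assumption made for illustration: the function names, the exponential form of the correlation, and the values L = 40 and N = 160 in units of the correlation length.

```python
import math

def g_closed(s, L):
    """Closed-form areal weighting factor g(s) for a square box of side L."""
    a = s / L
    if a <= 1.0:
        return math.pi / 2 - 2 * a + a * a / 2
    if a >= math.sqrt(2.0):
        return 0.0
    return (math.pi / 2 - 1 - 2 * math.acos(1 / a)
            + 2 * math.sqrt(a * a - 1) - a * a / 2)

def lambda_sq_integral(rho, L, n=100000):
    """Midpoint-rule evaluation of (A3): 4 * int_0^{sqrt(2) L} s g(s) rho(s) ds."""
    h = math.sqrt(2.0) * L / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += s * g_closed(s, L) * rho(s)
    return 4.0 * total * h

def lambda_sq_lattice(rho, L, N):
    """Weighted lattice sum (A.2) for an N x N array of footprints of side d = L/N."""
    d = L / N
    total = 0.0
    for m1 in range(-N, N + 1):
        for m2 in range(-N, N + 1):
            weight = (N - abs(m1)) * (N - abs(m2))
            total += weight * rho(d * math.hypot(m1, m2))
    return (L * L / N**4) * total

corr_len = 1.0                           # assumed e-folding length (arbitrary units)
rho = lambda s: math.exp(-s / corr_len)  # assumed exponential correlation
L = 40.0                                 # box side, large compared with corr_len

print("integral (A3):", lambda_sq_integral(rho, L))
print("lattice (A.2):", lambda_sq_lattice(rho, L, 160))
print("2*pi*corr_len^2:", 2 * math.pi * corr_len**2)
```

For a finite box, both evaluations fall somewhat below the limiting value, because the weighting factor g(s) decreases from π/2 as s grows; the shortfall shrinks as L increases relative to the correlation length. With a single footprint (N = 1) the lattice sum reduces to L², consistent with the single-footprint case noted earlier in this appendix.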

Although the continuous integral representation of $\Lambda^2$ given by Eq. (A3) in the
limit of infinite resolution is conceptually illuminating, estimation of the integral from
the finite resolution data in practice takes one back to a discrete sum. We estimated
$\Lambda^2$ for each 2.5° grid-box area as follows: The footprint pairs are binned according to
their mutual distance of separation in units of $d/2$, where $d$ is the nominal diameter of
an SSM/I footprint (about 28 km). For all the pairs belonging to the $k$th separation
bin ($k = 0, 1, 2, \ldots, k_{\max} = [2\sqrt{2}L/d]$, where $[x]$ denotes the integer part of $x$) we
compute the correlation coefficient $\rho_k$, the mean separation $s_k$, and the angular factor
$g_k = g(kd/2)$. In terms of these quantities a reasonably accurate estimate of $\Lambda^2$ is given
by the Riemann-sum approximation

$$4\sum_{k=0}^{k_{\max}} \frac{1}{2}\left(\rho_{k+1}s_{k+1}g_{k+1} + \rho_k s_k g_k\right)(s_{k+1}-s_k) .$$

This method of proceeding does not require the assumption that the footprints be
uniformly distributed in the area A that was used to develop the interpretation (A3) for
$\Lambda$. We have tested the accuracy of the approximation by plotting $s^2\Lambda^2/A$ against $\sigma_A^2$.
Our results are closely fitted by a straight line with unit slope.
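
The final summation step of this procedure can be sketched as follows. This is a hedged illustration only: it assumes the binning of footprint pairs and the computation of the per-bin correlations and mean separations have already been done, it reads the sum as a trapezoid-style sum over consecutive bins, and the function name `lambda_sq_from_bins` and the synthetic exponential test data are hypothetical, not taken from the paper.

```python
import math

def g_closed(s, L):
    """Closed-form areal weighting factor g(s) for a square box of side L."""
    a = s / L
    if a <= 1.0:
        return math.pi / 2 - 2 * a + a * a / 2
    if a >= math.sqrt(2.0):
        return 0.0
    return (math.pi / 2 - 1 - 2 * math.acos(1 / a)
            + 2 * math.sqrt(a * a - 1) - a * a / 2)

def lambda_sq_from_bins(s_mean, rho, d, L):
    """Estimate Lambda^2 from per-bin mean separations s_mean[k] and correlations rho[k].

    Bin k is assumed to hold pairs with nominal separation k*d/2, and the
    angular factor is evaluated there, g_k = g(k*d/2). The sum runs over
    consecutive pairs of available bins (an assumption about the upper limit).
    """
    if len(s_mean) != len(rho):
        raise ValueError("s_mean and rho must have the same length")
    f = [rho[k] * s_mean[k] * g_closed(0.5 * k * d, L) for k in range(len(rho))]
    total = 0.0
    for k in range(len(f) - 1):
        total += 0.5 * (f[k + 1] + f[k]) * (s_mean[k + 1] - s_mean[k])
    return 4.0 * total

# Synthetic check: exponential correlation sampled exactly at the nominal bin separations.
L, d, corr_len = 20.0, 0.1, 1.0
k_max = int(2.0 * math.sqrt(2.0) * L / d)
s_mean = [0.5 * k * d for k in range(k_max + 1)]
rho = [math.exp(-s / corr_len) for s in s_mean]
print(lambda_sq_from_bins(s_mean, rho, d, L))
```

With real data the mean separations within each bin need not coincide with the nominal values k d/2, which is why the two are kept as separate inputs here.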
Figure Captions


FIG. 1. Relative sampling error of monthly grid-box averages over the equatorial 
western Pacific as a function of mean local rain rate R. SSM/I estimates have been 
corrected for missing data. A power-law fit is shown. Estimates using surface radar data 
assume coverage identical to what is provided by the TRMM microwave instrument,
averaging 30 visits per month, very close to the SSM/I sampling. GATE radar data 
were taken during 1974. TOGA COARE radar data were taken contemporaneously with 
the SSM/I data. 

FIG. 2. Ratio of the variance of instantaneous area-averaged rain rate $R_A(t)$ to R, A
= 2.5° x 2.5°, computed following the procedures used for Fig. 1. The simple model 
predicts that this quantity should be insensitive to local rain rate. Error bars (95% 
conf.) are shown only for GATE, but others would have similar errors. A power-law fit 
to the SSM/I points is shown. Corresponding statistics derived from TRMM TMI data 
are also plotted, and will be discussed in Sec. 6. 

FIG. 3. The scale of “statistically independent rain events,” for SSM/I data in 2.5° 
x 2.5° grid boxes, from Eq. (2.13). If spatial correlations decreased exponentially as 
$\exp(-z/\lambda)$ with separation $z$ and the dimensions of A are large compared to $\lambda$, then
$\lambda = \Lambda/\sqrt{2\pi}$. See appendix for details.

FIG. 4. Mean $r_c$ and standard deviation $s_c$ of SSM/I rain rates in FOVs with nonzero
rain, and the ratio $\mu_c = s_c/r_c$.

FIG. 5. Autocorrelation of SSM/I rain rate averaged over 2.5° x 2.5° grid boxes for 
various categories of monthly rain rate R. Correlations are shown only when more than 
about 400 pairs of observations are available at a given separation r. Curves through 
data points are smoothed interpolations. 
Tables


TABLE 1. Power-law dependence of $r_c$, $s_c$, and $\Lambda$ on p, defined in Eq. (5.1), and
power-law dependence of $\sigma_A^2/R$ on R defined in Eq. (5.9), for TRMM TMI statistics
over the western tropical Pacific. As can be seen in Fig. 2, fits to the data must be
obtained separately for small and large R.

                         $\alpha$   $\beta$   $\gamma$   $\delta$   $r_0$ (mm h$^{-1}$)   $s_0$ (mm h$^{-1}$)   $\Lambda_0$ (km)
$R < 0.1$ mm h$^{-1}$    1.02       3.40      0.46       0.03       24.6                  379                   175
$R > 0.1$ mm h$^{-1}$    1.44       0.90      0.44       -0.34      104                   14                    165